Abstract: In this paper, we analyze the longest increasing contiguous sequence, or maximal
ascending run, of random variables that share a common uniform distribution but are not independent.
Their dependence is characterized by the fact that two successive random variables cannot take 
the same value. Using a Markov chain approach, we study the distribution of the maximal 
ascending run and we develop an algorithm to compute it. This problem comes from the 
analysis of several self-organizing protocols designed for large-scale wireless sensor networks, 
and we show how our results apply to this domain. 

Key-words: Markov chains, maximal ascending run, self-stabilization, convergence time. 



* INRIA Lille - Nord Europe/LIP6(USTL,CNRS), nathalie.mitton@inria.fr
† Université de Franche-Comté, katy.paroux@univ-fcomte.fr
‡ INRIA Rennes - Bretagne Atlantique, bruno.sericola@inria.fr
§ INRIA Saclay - Île-de-France/LIP6, sebastien.tixeuil@lri.fr



Centre de recherche INRIA Rennes - Bretagne Atlantique 
IRISA, Campus universitaire de Beaulieu, 35042 Rennes Cedex 

Telephone : +33 2 99 84 71 00 — Telecopie : +33 2 99 84 71 71 



Ascending runs of dependent uniformly distributed random variables:
application to wireless networks

1 Introduction

Let $X = (X_n)_{n \ge 1}$ be a sequence of identically distributed random variables on the set
$S = \{1, \ldots, m\}$. As in [8], we define an ascending run as a contiguous and increasing
subsequence in the process $X$. For instance, with $m = 5$, among the following first 20 values of
$X$: 23124342313451234341, there are 8 ascending runs and the length of the maximal ascending
run is 4. More formally, an ascending run of length $\ell \ge 1$, starting at position $k \ge 1$, is a
subsequence $(X_k, X_{k+1}, \ldots, X_{k+\ell-1})$ such that

$$X_{k-1} > X_k < X_{k+1} < \cdots < X_{k+\ell-1} > X_{k+\ell},$$

where we set $X_0 = \infty$ in order to avoid special cases at the boundary. Under the assumption
that the distribution is discrete and the random variables are independent, several authors 
have studied the behaviour of the maximal ascending run, as well as the longest non-decreasing
contiguous subsequence. The main results concern the asymptotic behaviour of these quantities 
when the number of random variables tends to infinity, see for example [6] and [4] and the 
references therein. Note that these two notions coincide when the common distribution is 
continuous. In this case, the asymptotic behaviour is known and does not depend on the 
distribution, as shown in [6]. 
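
The definition can be illustrated on the 20-value example above. The following is a minimal Python sketch; the helper name `ascending_runs` is ours (it does not come from the paper), and the function assumes a non-empty input sequence.

```python
def ascending_runs(xs):
    """Split a non-empty sequence into its maximal strictly increasing contiguous blocks."""
    runs = []
    current = [xs[0]]
    for prev, cur in zip(xs, xs[1:]):
        if cur > prev:
            current.append(cur)      # the current run continues
        else:
            runs.append(current)     # a descent closes the current run
            current = [cur]
    runs.append(current)             # the last run may be truncated by the end of the data
    return runs


sample = [int(c) for c in "23124342313451234341"]
runs = ascending_runs(sample)
print(len(runs), max(len(r) for r in runs))  # prints "8 4"
```

The blocks found are 23, 124, 34, 23, 1345, 1234, 34 and 1, which matches the count of 8 runs and the maximal length 4 quoted in the text.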

We denote by $M_n$ the length of the maximal ascending run among the first $n$ random
variables. The asymptotic behaviour of $M_n$ depends strongly on the common distribution of the
random variables $X_k$, $k \ge 1$. Some results have been established for the geometric distribution
in [10], where an equivalent of the law of $M_n$ is provided, and previously in [1], where the
almost-sure convergence is studied, as well as for the Poisson distribution.

In [9], the case of the uniform distribution on the set $\{1, \ldots, s\}$ is investigated. The
author considers the problem of the longest non-decreasing contiguous subsequence and gives an
equivalent of its law when $n$ is large and $s$ is fixed. The asymptotic equivalent of $\mathbb{E}(M_n)$ is also
conjectured.

In this paper, we consider a sequence $X = (X_n)_{n \ge 1}$ of integer random variables on the set
$S = \{1, \ldots, m\}$, with $m \ge 2$. The random variable $X_1$ is uniformly distributed on $S$ and, for
$n \ge 2$, $X_n$ is uniformly distributed on $S$ with the constraint $X_n \ne X_{n-1}$. This process may be
seen as the drawing of balls, numbered from 1 to $m$, in an urn where at each step the last ball
drawn is kept outside the urn. Thus we have, for every $i, j \in S$ and $n \ge 2$,

$$\mathbb{P}(X_1 = i) = \frac{1}{m} \quad \text{and} \quad \mathbb{P}(X_n = j \mid X_{n-1} = i) = \frac{1_{\{i \ne j\}}}{m - 1}.$$

By induction over $n$ and unconditioning, we get, for every $n \ge 1$ and $i \in S$,

$$\mathbb{P}(X_n = i) = \frac{1}{m}.$$

Hence the random variables X n are uniformly distributed on S but are not independent. Using 
a Markov chain approach, we study the distribution of the maximal ascending run and we 
develop an algorithm to compute it. This problem comes from the analysis of self-organizing 
protocols designed for large-scale wireless sensor networks, and we show how our results apply 
to this domain. 
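
The uniformity of the marginals can be checked by iterating the one-step transition law exactly. The sketch below is only an illustration: the helper names `transition_matrix` and `step` are ours, and exact rational arithmetic (`fractions.Fraction`) is an implementation choice made to avoid rounding.

```python
from fractions import Fraction


def transition_matrix(m):
    """One-step law of X: uniform over the m - 1 values different from the current one."""
    p = Fraction(1, m - 1)
    return [[Fraction(0) if i == j else p for j in range(m)] for i in range(m)]


def step(dist, matrix):
    """Push a distribution (row vector) through one transition."""
    m = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(m)) for j in range(m)]


m = 5
matrix = transition_matrix(m)
dist = [Fraction(1, m)] * m          # X_1 is uniform on S
for _ in range(10):                  # laws of X_2, ..., X_11
    dist = step(dist, matrix)
print(dist)                          # every entry stays equal to 1/m
```

The uniform law is invariant because each column of the transition matrix sums to 1, like each row.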

The remainder of the paper is organized as follows. In the next section, we use a Markov
chain approach to study the behavior of the sequence of ascending runs in the process $X$. In
Section 3, we analyze the hitting times of an ascending run of fixed length and we obtain the
distribution of the maximal ascending run $M_n$ over the first $n$ random variables $X_1, \ldots, X_n$ using a
Markov renewal argument. An algorithm to compute this distribution is developed in Section 4,
and Section 5 is devoted to the practical implications of this work for large-scale wireless sensor
networks.



2 Associated Markov chain 

The process $X$ is obviously a Markov chain on $S$. As observed in [10], we can see the ascending
runs as a discrete-time process having two components: the value taken by the first element of
the ascending run and its length. We denote this process by $Y = (V_k, L_k)_{k \ge 1}$, where $V_k$ is the
value of the first element of the $k$th ascending run and $L_k$ is its length. The state space of $Y$
is a subset of $S^2$ that we now make precise.

Only the first ascending run can start with the value $m$. Indeed, as soon as $k \ge 2$, the
random variable $V_k$ takes its values in $\{1, \ldots, m - 1\}$. Moreover, $V_1 = X_1 = m$ implies that
$L_1 = 1$. Thus, for any $\ell \ge 2$, $(m, \ell)$ is not a state of $Y$ whereas $(m, 1)$ is only an initial state
that $Y$ will never visit again.

We observe also that if $V_k = 1$ then necessarily $L_k \ge 2$, which implies that $(1, 1)$ is not a
state of $Y$. Moreover, $V_k = i$ implies that $L_k \le m - i + 1$.

According to this behaviour, we have 

$$Y_1 \in E \cup \{(m, 1)\} \quad \text{and, for } k \ge 2, \quad Y_k \in E,$$

where

$$E = \{(i, \ell) \mid 1 \le i \le m - 1 \text{ and } 1 \le \ell \le m - i + 1\} \setminus \{(1, 1)\}.$$
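
As a sanity check on this description of the state space, one can enumerate $E$ and compare it with the states actually visited by complete runs of index $k \ge 2$ over all admissible sequences of a small length. This is a brute-force sketch for small $m$ only; the function names `state_space` and `observed_states` are ours.

```python
from itertools import product


def state_space(m):
    """States (i, l) with 1 <= i <= m - 1 and 1 <= l <= m - i + 1, excluding (1, 1)."""
    return {(i, l) for i in range(1, m) for l in range(1, m - i + 2)} - {(1, 1)}


def observed_states(m, n):
    """(first value, length) of every complete run of index k >= 2,
    over all sequences of length n in which successive values differ."""
    seen = set()
    for xs in product(range(1, m + 1), repeat=n):
        if any(a == b for a, b in zip(xs, xs[1:])):
            continue                                   # not an admissible path of X
        starts = [0] + [k for k in range(1, n) if xs[k] < xs[k - 1]]
        # Skip the first run (index 0) and the last one, which may be truncated.
        for s, e in zip(starts[1:], starts[2:]):
            seen.add((xs[s], e - s))
    return seen


print(sorted(state_space(3)))   # [(1, 2), (1, 3), (2, 1), (2, 2)]
```

Sequences of length $n = m + 2$ are long enough to exhibit every state, since a run of index at least 2 needs one preceding value, at most $m$ values of its own, and one closing descent.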
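We define the following useful quantities for any $i, j, \ell \in S$ and $k \ge 1$:

$$\Phi_\ell(i, j) = \mathbb{P}(V_{k+1} = j, L_k = \ell \mid V_k = i), \tag{1}$$
$$\varphi_\ell(i) = \mathbb{P}(L_k = \ell \mid V_k = i), \tag{2}$$
$$\psi_\ell(i) = \mathbb{P}(L_k \ge \ell \mid V_k = i). \tag{3}$$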

Theorem 1. The process $Y$ is a homogeneous Markov chain with transition probability matrix
$P$, whose entries are given for any $(i, \ell) \in E \cup \{(m, 1)\}$ and $(j, \lambda) \in E$ by

$$P_{(i,\ell),(j,\lambda)} = \frac{\Phi_\ell(i, j)\, \varphi_\lambda(j)}{\varphi_\ell(i)}.$$

Proof. We exploit the Markov property of $X$, rewriting events for $Y$ as events for $X$.

For every $(j, \lambda) \in E$ and $k \ge 1$, and for any $(v_k, \ell_k), \ldots, (v_1, \ell_1) \in E \cup \{(m, 1)\}$, we
denote by $A_k$ the event

$$A_k = \{Y_k = (v_k, \ell_k), \ldots, Y_1 = (v_1, \ell_1)\}.$$

We have to check that

$$\mathbb{P}(Y_{k+1} = (j, \lambda) \mid A_k) = \mathbb{P}(Y_2 = (j, \lambda) \mid Y_1 = (v_k, \ell_k)).$$

First, we observe that

$$A_1 = \{Y_1 = (v_1, \ell_1)\} = \{X_1 = v_1 < \cdots < X_{\ell_1} > X_{\ell_1 + 1}\},$$

and

$$\begin{aligned}
A_2 &= \{Y_2 = (v_2, \ell_2), Y_1 = (v_1, \ell_1)\} \\
&= \{X_1 = v_1 < \cdots < X_{\ell_1} > X_{\ell_1 + 1} = v_2 < \cdots < X_{\ell_1 + \ell_2} > X_{\ell_1 + \ell_2 + 1}\} \\
&= A_1 \cap \{X_{\ell_1 + 1} = v_2 < \cdots < X_{\ell_1 + \ell_2} > X_{\ell_1 + \ell_2 + 1}\}.
\end{aligned}$$

By induction, we obtain

$$A_k = A_{k-1} \cap \{X_{\ell(k-1)+1} = v_k < \cdots < X_{\ell(k)} > X_{\ell(k)+1}\},$$

where $\ell(k) = \ell_1 + \cdots + \ell_k$. Using this remark and the fact that $X$ is a homogeneous Markov
chain, we get

$$\begin{aligned}
\mathbb{P}(Y_{k+1} = (j, \lambda) \mid A_k) &= \mathbb{P}(V_{k+1} = j, L_{k+1} = \lambda \mid A_k) \\
&= \mathbb{P}(X_{\ell(k)+1} = j < \cdots < X_{\ell(k)+\lambda} > X_{\ell(k)+\lambda+1} \mid X_{\ell(k-1)+1} = v_k < \cdots < X_{\ell(k)} > X_{\ell(k)+1}, A_{k-1}) \\
&= \mathbb{P}(X_{\ell(k)+1} = j < \cdots < X_{\ell(k)+\lambda} > X_{\ell(k)+\lambda+1} \mid X_{\ell(k-1)+1} = v_k < \cdots < X_{\ell(k)} > X_{\ell(k)+1}) \\
&= \mathbb{P}(X_{\ell_k+1} = j < \cdots < X_{\ell_k+\lambda} > X_{\ell_k+\lambda+1} \mid X_1 = v_k < \cdots < X_{\ell_k} > X_{\ell_k+1}) \\
&= \mathbb{P}(V_2 = j, L_2 = \lambda \mid V_1 = v_k, L_1 = \ell_k) \\
&= \mathbb{P}(Y_2 = (j, \lambda) \mid Y_1 = (v_k, \ell_k)).
\end{aligned}$$

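We now have to show that

$$\mathbb{P}(Y_{k+1} = (j, \lambda) \mid Y_k = (v_k, \ell_k)) = \mathbb{P}(Y_2 = (j, \lambda) \mid Y_1 = (v_k, \ell_k)).$$

Using the previous result, we have

$$\begin{aligned}
\mathbb{P}(Y_{k+1} = (j, \lambda) \mid Y_k = (v_k, \ell_k))
&= \frac{\mathbb{P}(Y_{k+1} = (j, \lambda), Y_k = (v_k, \ell_k))}{\mathbb{P}(Y_k = (v_k, \ell_k))} \\
&= \frac{\displaystyle\sum_{i=1}^{k-1} \sum_{(v_i, \ell_i) \in E} \mathbb{P}(Y_{k+1} = (j, \lambda), Y_k = (v_k, \ell_k), A_{k-1})}{\displaystyle\sum_{i=1}^{k-1} \sum_{(v_i, \ell_i) \in E} \mathbb{P}(Y_k = (v_k, \ell_k), A_{k-1})} \\
&= \frac{\displaystyle\sum_{i=1}^{k-1} \sum_{(v_i, \ell_i) \in E} \mathbb{P}(Y_{k+1} = (j, \lambda) \mid A_k)\, \mathbb{P}(A_k)}{\displaystyle\sum_{i=1}^{k-1} \sum_{(v_i, \ell_i) \in E} \mathbb{P}(A_k)} \\
&= \mathbb{P}(Y_2 = (j, \lambda) \mid Y_1 = (v_k, \ell_k)).
\end{aligned}$$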

We have shown that $Y$ is a homogeneous Markov chain over its state space. The entries of
matrix $P$ are then given, for every $(j, \lambda) \in E$ and $(i, \ell) \in E \cup \{(m, 1)\}$, by

$$\begin{aligned}
P_{(i,\ell),(j,\lambda)} &= \mathbb{P}(V_{k+1} = j, L_{k+1} = \lambda \mid V_k = i, L_k = \ell) \\
&= \mathbb{P}(V_{k+1} = j \mid V_k = i, L_k = \ell)\, \mathbb{P}(L_{k+1} = \lambda \mid V_{k+1} = j, V_k = i, L_k = \ell) \\
&= \mathbb{P}(V_{k+1} = j \mid V_k = i, L_k = \ell)\, \mathbb{P}(L_{k+1} = \lambda \mid V_{k+1} = j) \\
&= \frac{\mathbb{P}(V_{k+1} = j, L_k = \ell \mid V_k = i)}{\mathbb{P}(L_k = \ell \mid V_k = i)}\, \varphi_\lambda(j) \\
&= \frac{\Phi_\ell(i, j)\, \varphi_\lambda(j)}{\varphi_\ell(i)},
\end{aligned}$$

where the third equality follows from the Markov property.



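We give the expressions of $\varphi_\lambda(j)$ and $\Phi_\ell(i, j)$ for every $i, j, \ell \in S$ in the following lemma.
Lemma 2. For every $i, j, \ell \in S$, we have

$$\Phi_\ell(i, j) = \frac{\binom{m-i}{\ell-1} 1_{\{m-i \ge \ell-1\}} - \binom{j-i}{\ell-1} 1_{\{j-i \ge \ell-1\}}}{(m-1)^{\ell}},$$

$$\psi_\ell(i) = \frac{\binom{m-i}{\ell-1}}{(m-1)^{\ell-1}}\, 1_{\{m-i \ge \ell-1\}},$$

$$\varphi_\ell(i) = \frac{\binom{m-i}{\ell-1}}{(m-1)^{\ell-1}}\, 1_{\{m-i \ge \ell-1\}} - \frac{\binom{m-i}{\ell}}{(m-1)^{\ell}}\, 1_{\{m-i \ge \ell\}}.$$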

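Proof. For every $i, j, \ell \in S$, it is easily checked that $\Phi_\ell(i, j) = 0$ if $m < i + \ell - 1$. If $m \ge i + \ell - 1$,
we have

$$\begin{aligned}
\Phi_\ell(i, j) &= \mathbb{P}(V_2 = j, L_1 = \ell \mid V_1 = i) \\
&= \mathbb{P}(i < X_2 < \cdots < X_\ell > X_{\ell+1} = j \mid X_1 = i) \\
&= \mathbb{P}(i < X_2 < \cdots < X_\ell, X_{\ell+1} = j \mid X_1 = i) \\
&\quad - \mathbb{P}(i < X_2 < \cdots < X_\ell < X_{\ell+1} = j \mid X_1 = i)\, 1_{\{j > i + \ell - 1\}}.
\end{aligned} \tag{4}$$

We introduce the sets $G_1(i, j, \ell, m)$, $G_2(i, j, \ell, m)$, $G(i, \ell, m)$ and $H(\ell, m)$ defined by

$$\begin{aligned}
G_1(i, j, \ell, m) &= \{(x_2, \ldots, x_{\ell+1}) \in \{i+1, \ldots, m\}^{\ell} \,;\, x_2 < \cdots < x_\ell \ne x_{\ell+1} = j\}, \\
G_2(i, j, \ell, m) &= \{(x_2, \ldots, x_{\ell+1}) \in \{i+1, \ldots, m\}^{\ell} \,;\, x_2 < \cdots < x_\ell = x_{\ell+1} = j\}, \\
G(i, \ell, m) &= \{(x_2, \ldots, x_\ell) \in \{i+1, \ldots, m\}^{\ell-1} \,;\, x_2 < \cdots < x_\ell\}, \\
H(\ell, m) &= \{(x_2, \ldots, x_{\ell+1}) \in \{1, \ldots, m\}^{\ell} \,;\, i \ne x_2 \ne \cdots \ne x_{\ell+1}\}.
\end{aligned}$$

It is well-known, see for instance [5], that

$$|G(i, \ell, m)| = \binom{m-i}{\ell-1} \quad \text{and} \quad |H(\ell, m)| = (m-1)^{\ell}.$$

Since $|G_2(i, j, \ell, m)| = |G(i, \ell - 1, j - 1)|$, the first term in (4) can be written as

$$\begin{aligned}
\mathbb{P}(i < X_2 < \cdots < X_\ell, X_{\ell+1} = j \mid X_1 = i)
&= \frac{|G_1(i, j, \ell, m)|}{|H(\ell, m)|} \\
&= \frac{|G(i, \ell, m)| - |G_2(i, j, \ell, m)|}{|H(\ell, m)|} \\
&= \frac{|G(i, \ell, m)| - |G(i, \ell - 1, j - 1)|}{|H(\ell, m)|} \\
&= \frac{\binom{m-i}{\ell-1} - \binom{j-i-1}{\ell-2} 1_{\{j-i \ge \ell-1\}}}{(m-1)^{\ell}}.
\end{aligned}$$

The second term is given, for $j > i + \ell - 1$, by

$$\mathbb{P}(i < X_2 < \cdots < X_\ell < X_{\ell+1} = j \mid X_1 = i) = \frac{|G(i, \ell, j - 1)|}{|H(\ell, m)|} = \frac{\binom{j-i-1}{\ell-1}}{(m-1)^{\ell}}.$$

Combining these two terms in (4), we get

$$\begin{aligned}
\Phi_\ell(i, j) &= \frac{\binom{m-i}{\ell-1} - \binom{j-i-1}{\ell-2} 1_{\{j-i \ge \ell-1\}} - \binom{j-i-1}{\ell-1} 1_{\{j-i \ge \ell\}}}{(m-1)^{\ell}} \\
&= \frac{\binom{m-i}{\ell-1} 1_{\{m-i \ge \ell-1\}} - \binom{j-i}{\ell-1} 1_{\{j-i \ge \ell-1\}}}{(m-1)^{\ell}},
\end{aligned}$$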

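which completes the proof of the first relation.

The second relation follows from expression (3) by writing

$$\begin{aligned}
\psi_\ell(i) &= \mathbb{P}(L_1 \ge \ell \mid V_1 = i) \\
&= \mathbb{P}(i < X_2 < \cdots < X_\ell \mid X_1 = i)\, 1_{\{m-i \ge \ell-1\}} \\
&= \frac{|G(i, \ell, m)|}{|H(\ell - 1, m)|}\, 1_{\{m-i \ge \ell-1\}} \\
&= \frac{\binom{m-i}{\ell-1}}{(m-1)^{\ell-1}}\, 1_{\{m-i \ge \ell-1\}}.
\end{aligned}$$

The third relation follows from definition (2) by writing $\varphi_\ell(i) = \psi_\ell(i) - \psi_{\ell+1}(i)$.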
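Note that the matrix $\Phi$ defined by

$$\Phi = \sum_{\ell=1}^{m} \Phi_\ell$$

is obviously a stochastic matrix, which means that, for every $i = 1, \ldots, m$, we have

$$\sum_{j=1}^{m} \Phi(i, j) = \sum_{j=1}^{m} \sum_{\ell=1}^{m} \Phi_\ell(i, j) = \sum_{\ell=1}^{m} \varphi_\ell(i) = 1.$$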
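
The closed forms reached in the proof of Lemma 2 can be checked numerically for small $m$. The sketch below assumes the expressions $\psi_\ell(i) = \binom{m-i}{\ell-1} (m-1)^{-(\ell-1)} 1_{\{m-i \ge \ell-1\}}$, $\varphi_\ell(i) = \psi_\ell(i) - \psi_{\ell+1}(i)$ and $\Phi_\ell(i,j) = \big[\binom{m-i}{\ell-1} 1_{\{m-i \ge \ell-1\}} - \binom{j-i}{\ell-1} 1_{\{j-i \ge \ell-1\}}\big] (m-1)^{-\ell}$, and compares $\Phi_\ell$ with a direct enumeration of the paths of $X$. Function names (`psi`, `phi`, `Phi`, `Phi_brute`) are ours.

```python
from fractions import Fraction
from itertools import product
from math import comb


def psi(l, i, m):
    """P(L_k >= l | V_k = i), closed form."""
    if m - i < l - 1:
        return Fraction(0)
    return Fraction(comb(m - i, l - 1), (m - 1) ** (l - 1))


def phi(l, i, m):
    """P(L_k = l | V_k = i) as a difference of two tails."""
    return psi(l, i, m) - psi(l + 1, i, m)


def Phi(l, i, j, m):
    """P(V_{k+1} = j, L_k = l | V_k = i), closed form."""
    a = comb(m - i, l - 1) if m - i >= l - 1 else 0
    b = comb(j - i, l - 1) if j - i >= l - 1 else 0
    return Fraction(a - b, (m - 1) ** l)


def Phi_brute(l, i, j, m):
    """Same probability by enumerating (x_2, ..., x_{l+1}) given x_1 = i."""
    count = 0
    for tail in product(range(1, m + 1), repeat=l):
        xs = (i,) + tail
        if any(a == b for a, b in zip(xs, xs[1:])):
            continue                                   # successive values must differ
        rising = all(a < b for a, b in zip(xs[:l], xs[1:l]))
        if rising and xs[l] < xs[l - 1] and xs[l] == j:
            count += 1
    # Each admissible continuation of length l has probability (m - 1)^(-l).
    return Fraction(count, (m - 1) ** l)


m = 4
rows = [sum(Phi(l, i, j, m) for l in range(1, m + 1) for j in range(1, m + 1))
        for i in range(1, m + 1)]
print(rows)   # row sums of the matrix Phi
```

Exact rationals make the comparison with the enumeration an equality test rather than a tolerance check.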

3 Hitting times and maximal ascending run 

For every $r = 1, \ldots, m$, we denote by $T_r$ the hitting time of an ascending run of length at least
equal to $r$. More formally, we have

$$T_r = \inf\{k \ge r \,;\, X_{k-r+1} < \cdots < X_k\}.$$

It is easy to check that we have $T_1 = 1$ and $T_r \ge r$. The distribution of $T_r$ is given by the
following theorem.
Theorem 3. For $2 \le r \le m$, we have

$$\mathbb{P}(T_r \le n \mid V_1 = i) =
\begin{cases}
0 & \text{if } 1 \le n \le r - 1, \\[4pt]
\psi_r(i) + \displaystyle\sum_{\ell=1}^{r-1} \sum_{j=1}^{m} \Phi_\ell(i, j)\, \mathbb{P}(T_r \le n - \ell \mid V_1 = j) & \text{if } n \ge r.
\end{cases} \tag{5}$$



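Proof. Since $T_r \ge r$, we have, for $1 \le n \le r - 1$,

$$\mathbb{P}(T_r \le n \mid V_1 = i) = 0.$$

Let us assume from now on that $n \ge r$. Since $L_1 \ge r$ implies that $T_r = r$, we get

$$\mathbb{P}(T_r \le n, L_1 \ge r \mid V_1 = i) = \mathbb{P}(L_1 \ge r \mid V_1 = i) = \psi_r(i). \tag{6}$$

We introduce the random variable $T_r^{(p)}$, defined as the hitting time of an ascending run of length at
least equal to $r$ when counting from position $p$. Thus we have

$$T_r^{(p)} = \inf\{k \ge r \,;\, X_{p+k-r} < \cdots < X_{p+k-1}\}.$$

We then have $T_r = T_r^{(1)}$. Moreover, $L_1 = \ell < r$ implies that $T_r = T_r^{(L_1+1)} + \ell$, which leads to

$$\begin{aligned}
\mathbb{P}(T_r \le n, L_1 < r \mid V_1 = i)
&= \sum_{\ell=1}^{r-1} \mathbb{P}(T_r \le n, L_1 = \ell \mid V_1 = i) \\
&= \sum_{\ell=1}^{r-1} \mathbb{P}(T_r^{(L_1+1)} \le n - \ell, L_1 = \ell \mid V_1 = i) \\
&= \sum_{\ell=1}^{r-1} \sum_{j=1}^{m} \mathbb{P}(T_r^{(L_1+1)} \le n - \ell, V_2 = j, L_1 = \ell \mid V_1 = i) \\
&= \sum_{\ell=1}^{r-1} \sum_{j=1}^{m} \Phi_\ell(i, j)\, \mathbb{P}(T_r^{(L_1+1)} \le n - \ell \mid V_2 = j, L_1 = \ell, V_1 = i) \\
&= \sum_{\ell=1}^{r-1} \sum_{j=1}^{m} \Phi_\ell(i, j)\, \mathbb{P}(T_r^{(L_1+1)} \le n - \ell \mid V_2 = j) \\
&= \sum_{\ell=1}^{r-1} \sum_{j=1}^{m} \Phi_\ell(i, j)\, \mathbb{P}(T_r \le n - \ell \mid V_1 = j),
\end{aligned} \tag{7}$$

where the fifth equality follows from the Markov property and the last one from the homogeneity
of $Y$. Putting together relations (6) and (7), we obtain

$$\mathbb{P}(T_r \le n \mid V_1 = i) = \psi_r(i) + \sum_{\ell=1}^{r-1} \sum_{j=1}^{m} \Phi_\ell(i, j)\, \mathbb{P}(T_r \le n - \ell \mid V_1 = j).$$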
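
As an illustration of the recursion, consider the smallest case $m = 2$, $r = 2$, where the process simply alternates between 1 and 2. The values of $\psi_2$ and $\Phi_1$ used below are read off by direct inspection of this two-state process; the worked example is ours and is not part of the original proof.

```latex
% m = 2: from X_1 = 1 the next value is 2, from X_1 = 2 the next value is 1.
\psi_2(1) = 1, \qquad \psi_2(2) = 0, \qquad
\Phi_1(1, j) = 0 \ (j = 1, 2), \qquad \Phi_1(2, 1) = 1, \qquad \Phi_1(2, 2) = 0.

% Recursion for n >= 2:
\mathbb{P}(T_2 \le n \mid V_1 = 1) = \psi_2(1) = 1,
\qquad
\mathbb{P}(T_2 \le n \mid V_1 = 2) = \Phi_1(2, 1)\, \mathbb{P}(T_2 \le n - 1 \mid V_1 = 1) = 1_{\{n \ge 3\}}.
```

This agrees with the paths themselves: starting from 1 the sequence begins 1, 2, so $T_2 = 2$, while starting from 2 it begins 2, 1, 2, so $T_2 = 3$.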
For every $n \ge 1$, we define $M_n$ as the maximal ascending run length over the first $n$ values
$X_1, \ldots, X_n$. We have $1 \le M_n \le m \wedge n$ and

$$M_n \ge r \iff T_r \le n,$$

which implies

$$\mathbb{E}(M_n) = \sum_{r=1}^{m \wedge n} \mathbb{P}(M_n \ge r) = \sum_{r=1}^{m \wedge n} \mathbb{P}(T_r \le n) = \frac{1}{m} \sum_{r=1}^{m \wedge n} \sum_{i=1}^{m} \mathbb{P}(T_r \le n \mid V_1 = i).$$
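
For small $m$ and $n$, the identity above can be checked by exhaustive enumeration, since every admissible sequence (no two successive values equal) has the same probability. The following sketch is a brute-force reference only, exponential in $n$; the function names are ours.

```python
from fractions import Fraction
from itertools import product


def max_run(xs):
    """Length of the longest strictly increasing contiguous block of a non-empty sequence."""
    best = run = 1
    for prev, cur in zip(xs, xs[1:]):
        run = run + 1 if cur > prev else 1
        best = max(best, run)
    return best


def hitting_time(xs, r):
    """Smallest 1-based k with an ascending run of length >= r ending at k, else None."""
    run = 0
    for k in range(1, len(xs) + 1):
        run = run + 1 if k > 1 and xs[k - 1] > xs[k - 2] else 1
        if run >= r:
            return k
    return None


def admissible(m, n):
    """All sequences of length n over {1..m} whose successive values differ."""
    for xs in product(range(1, m + 1), repeat=n):
        if all(a != b for a, b in zip(xs, xs[1:])):
            yield xs


def expected_max_run_brute(m, n):
    """E(M_n) as the plain average of max_run over equally likely sequences."""
    seqs = list(admissible(m, n))
    return Fraction(sum(max_run(xs) for xs in seqs), len(seqs))


def expected_max_run_tail_sum(m, n):
    """E(M_n) as the sum over r of P(T_r <= n)."""
    seqs = list(admissible(m, n))
    total = Fraction(0)
    for r in range(1, min(m, n) + 1):
        hits = sum(1 for xs in seqs if hitting_time(xs, r) is not None)
        total += Fraction(hits, len(seqs))
    return total


print(expected_max_run_brute(3, 3), expected_max_run_tail_sum(3, 3))
```

Both functions return the same rational number, which gives a reference value against which any faster computation can be compared.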

4 Algorithm 

For $r = 1, \ldots, m$, we denote by $\psi_r$ the column vector of dimension $m$ whose $i$th entry is $\psi_r(i)$.
For $r = 1, \ldots, m$, $n \ge 1$ and $h = 1, \ldots, n$, we denote by $W_{h,r}$ the column vector of dimension
$m$ whose $i$th entry is defined by

$$W_{h,r}(i) = \mathbb{P}(T_r \le h \mid V_1 = i) = \mathbb{P}(M_h \ge r \mid V_1 = i),$$

and we denote by $\mathbf{1}$ the column vector of dimension $m$ with all entries equal to 1. An algorithm
for the computation of the distribution and the expectation of $M_n$ is given in Table 1.

input: $m$, $n$
output: $\mathbb{E}(M_h)$ for $h = 1, \ldots, n$.

for $\ell = 1$ to $m$ do Compute the matrix $\Phi_\ell$ endfor
for $r = 1$ to $m$ do Compute the column vectors $\psi_r$ endfor
for $h = 1$ to $n$ do $W_{h,1} = \mathbf{1}$ endfor
for $r = 2$ to $m \wedge n$ do
    for $h = 1$ to $r - 1$ do $W_{h,r} = 0$ endfor
    for $h = r$ to $n$ do $W_{h,r} = \psi_r + \sum_{\ell=1}^{r-1} \Phi_\ell W_{h-\ell,r}$ endfor
endfor
for $h = 1$ to $n$ do $\mathbb{E}(M_h) = \frac{1}{m} \sum_{r=1}^{m \wedge h} \mathbf{1}^{\mathsf{T}} W_{h,r}$ endfor

Table 1: Algorithm for the computation of the distribution and expectation of $M_n$.
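
The steps of Table 1 can be sketched in Python as follows. This is an illustrative transcription, not the authors' implementation: it assumes the closed forms of $\psi_\ell$ and $\Phi_\ell$ derived in the proof of Lemma 2, stores the vectors $W_{h,r}$ in a dictionary, and uses exact rationals; all function names are ours.

```python
from fractions import Fraction
from math import comb


def psi(l, i, m):
    """P(L_k >= l | V_k = i), closed form assumed from Lemma 2."""
    if m - i < l - 1:
        return Fraction(0)
    return Fraction(comb(m - i, l - 1), (m - 1) ** (l - 1))


def Phi(l, i, j, m):
    """P(V_{k+1} = j, L_k = l | V_k = i), closed form assumed from Lemma 2."""
    a = comb(m - i, l - 1) if m - i >= l - 1 else 0
    b = comb(j - i, l - 1) if j - i >= l - 1 else 0
    return Fraction(a - b, (m - 1) ** l)


def expected_max_runs(m, n):
    """Return [E(M_1), ..., E(M_n)] following the loops of Table 1."""
    idx = range(1, m + 1)
    Phi_mats = {l: [[Phi(l, i, j, m) for j in idx] for i in idx] for l in idx}
    psi_vecs = {r: [psi(r, i, m) for i in idx] for r in idx}

    # W[(h, r)][i - 1] = P(T_r <= h | V_1 = i); T_1 = 1, so W_{h,1} is all ones.
    W = {(h, 1): [Fraction(1)] * m for h in range(1, n + 1)}
    for r in range(2, min(m, n) + 1):
        for h in range(1, r):
            W[(h, r)] = [Fraction(0)] * m          # T_r >= r
        for h in range(r, n + 1):
            vec = list(psi_vecs[r])
            for l in range(1, r):
                prev = W[(h - l, r)]
                for a in range(m):
                    vec[a] += sum(Phi_mats[l][a][b] * prev[b] for b in range(m))
            W[(h, r)] = vec

    # E(M_h) = (1/m) * sum over r <= min(m, h) of the entries of W_{h,r}.
    return [sum(sum(W[(h, r)]) for r in range(1, min(m, h) + 1)) / m
            for h in range(1, n + 1)]


print(expected_max_runs(3, 3))
```

The cost is dominated by the matrix-vector products, roughly $m^2$ operations for each triple $(r, h, \ell)$; replacing `Fraction` by floating-point numbers would be the natural choice for large $m$ and $n$, at the price of rounding.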

5 Application to wireless networks : fast self-organization 

Our analysis has important implications for forthcoming large-scale wireless networks. In those
networks, the number of machines involved and the likelihood of faults prevent
any centralized planning. Instead, distributed self-organization must be designed to enable
proper functioning of the network. A useful technique to provide self-organization is
self-stabilization [2, 3]. Self-stabilization is a versatile technique that can make a wireless network
withstand any kind of fault and reconfiguration.

A common drawback of self-stabilizing protocols is that they were not designed to handle
large-scale networks properly, as the stabilization time (the maximum amount of time needed to
recover from any possible disaster) could be related to the actual size of the network. In many
cases, this high complexity was due to the fact that network-wide unique identifiers are used
to arbitrate symmetric situations [13]. However, a number of problems appearing
in wireless networks need only locally unique identifiers.

Modeling the network as a graph where nodes represent wireless entities and where edges 
represent the ability to communicate between two entities (because each is within the
transmission range of the other), a local coloring of the nodes at distance $d$ (i.e. having any two nodes
at distance $d$ or less assigned distinct colors) can be enough to solve a wide range of problems.
For example, local coloring at distance 3 can be used to assign TDMA time slots in an adaptive
manner [7], and local coloring at distance 2 has successfully been used to self-organize a wireless
network into more manageable clusters [12].

In the performance analysis of both schemes, it appears that the overall stabilization time 
is balanced by a tradeoff between the coloring time itself and the stabilization time of the 
protocol using the coloring (denoted in the following as the client protocol). In both cases 
(TDMA assignment and clustering), the stabilization time of the client protocol is related 
to the height of the directed acyclic graph induced by the colors. This DAG is obtained by 
orienting an edge from the node with the highest color to the neighbor with the lowest color. 
As a result, the overall height of this DAG is equal to the longest strictly ascending chain of 
colors across neighboring nodes. Of course, a larger set of colors leads to a shorter stabilization 
time for the coloring (due to the higher chance of picking a fresh color), but yields a potentially
higher DAG, which could delay the stabilization of the client protocol.
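
The height of the color-induced DAG can be computed directly from a coloring. The sketch below is an illustration under our own conventions: the graph is given as an adjacency dictionary, neighboring nodes are assumed to have distinct colors, and the height is counted in nodes so that it matches the length of an ascending run; the function name `dag_height` is ours.

```python
from functools import lru_cache


def dag_height(adjacency, color):
    """Number of nodes on the longest path whose colors strictly increase along edges."""

    @lru_cache(maxsize=None)
    def longest_from(u):
        # Follow only edges that go to a strictly larger color, so recursion terminates.
        ups = [longest_from(v) for v in adjacency[u] if color[v] > color[u]]
        return 1 + max(ups, default=0)

    return max(longest_from(u) for u in adjacency)


# A 20-node chain colored with the example sequence of the introduction.
colors = [int(c) for c in "23124342313451234341"]
chain = {k: [v for v in (k - 1, k + 1) if 0 <= v < len(colors)]
         for k in range(len(colors))}
print(dag_height(chain, colors))
```

On an undirected chain the longest ascending path may be read in either direction; for this particular coloring the result coincides with the maximal ascending run of the sequence.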

In [11], the stabilization time of the coloring protocol was theoretically analyzed while the 
stabilization time of a particular client protocol (the clustering scheme of [12]) was only studied 
by simulation. The analysis performed in this paper gives a theoretical upper bound on the 
stabilization time of all client protocols that use a coloring scheme as an underlying basis. 
Together with the results of [11], our study constitutes a comprehensive analysis of the overall 
stabilization time of a class of self-stabilizing protocols used for the self-organization of wireless 
sensor networks. In the remainder of this section, we provide quantitative results regarding the
relative importance of the number of used colors with respect to other network parameters. 

Figure 1 shows the expected length of the maximal ascending run over an $n$-node chain for
different values of m. 

Results show several interesting behaviors. Indeed, self-organization protocols relying on 
a coloring process achieve better stabilization time when the expected length of maximal as- 
cending run is short but a coloring process stabilizes faster when the number of colors is high 
[11]. 

Figure 1 clearly shows that even if the number of colors is high compared to n (n << m), the 
expected length of maximal ascending run remains short, which is a great advantage. Moreover, 
even if the number of nodes increases, the expected length of the maximal ascending run remains 
short and increases very slowly. This observation demonstrates the scalability properties of a 
protocol relying on a local coloring process since its stabilization time is directly linked to the 
length of this ascending run [11]. 

Figure 2 shows the expected length of the maximal ascending run over an $n$-node chain for
different values of n. 

The results show that, for a fixed number of nodes $n$, the expected length of the maximal
ascending run converges to a finite value that depends on $n$. This implies that using a large
number of colors does not impact the stabilization time of the client algorithm. 
Figure 1: Expected length of the maximal ascending run as a function of the number of nodes.